MIN-Entropy: A New Signature File Declustering Algorithm for Intra-Query Parallelism
Authors
Abstract
Intra-query parallelism is important for achieving high performance in a parallel processing environment. To process a signature file in parallel, an effective declustering algorithm that avoids both data skew and execution skew is needed. The Linear Code Decomposition Method (LCDM), which is used for the Hamming filter, may give good performance, but it fails to decluster a signature file evenly when the data is skewed. It also has other problems, such as limited scalability and non-determinism. In this paper we propose a new signature file declustering algorithm, called MIN-entropy, that overcomes these problems of the LCDM. MIN-entropy declusters a signature file based on a measure of execution load uniformity, called signature entropy, that is derived from the previously declustered signatures. Since MIN-entropy uses dynamic information such as accumulated signatures, it provides high performance for a variety of workloads and configurations. We show through simulation experiments that MIN-entropy outperforms the LCDM under various data distributions.
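The abstract does not define signature entropy, so the following Python sketch only illustrates the general shape of the idea: each incoming signature goes to the processor whose accumulated signatures give the lowest score. The function names `overlap_score` and `decluster`, the bit-count bookkeeping, and the scoring rule itself are assumptions made for illustration; they are a stand-in for the paper's measure, not a reproduction of it.

```python
from typing import Callable, List


def overlap_score(bit_counts: List[int], signature: int) -> int:
    """Hypothetical stand-in for signature entropy (NOT the paper's definition).

    Sums, over the bits set in `signature`, how many signatures already
    stored on a processor have the same bit set.
    """
    score = 0
    bit = 0
    while signature:
        if signature & 1:
            score += bit_counts[bit]
        signature >>= 1
        bit += 1
    return score


def decluster(
    signatures: List[int],
    num_processors: int,
    width: int,
    score: Callable[[List[int], int], int] = overlap_score,
) -> List[int]:
    """Greedily assign each signature to the processor with the lowest score.

    The score depends only on signatures accumulated so far, so the result
    reflects the actual data rather than a fixed code-based partition.
    Ties are broken by stored-signature count, then by processor index,
    which keeps the assignment deterministic.
    """
    bit_counts = [[0] * width for _ in range(num_processors)]
    sizes = [0] * num_processors
    assignment = []
    for sig in signatures:
        if sig < 0 or sig >> width:
            raise ValueError("signature does not fit in the given width")
        target = min(
            range(num_processors),
            key=lambda p: (score(bit_counts[p], sig), sizes[p], p),
        )
        assignment.append(target)
        sizes[target] += 1
        # Fold the new signature into the processor's accumulated bit counts.
        for bit in range(width):
            if (sig >> bit) & 1:
                bit_counts[target][bit] += 1
    return assignment


if __name__ == "__main__":
    # Two pairs of identical signatures are split across the two processors.
    print(decluster([0b1100, 0b1100, 0b0011, 0b0011], num_processors=2, width=4))
```

With this stand-in score, identical signatures end up on different processors, which is the kind of skew avoidance the abstract describes. Any other measure can be passed through the `score` parameter.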
Similar resources
Applying SD-Tree for Object-Oriented query processing
We follow a signature-based approach to object-oriented query handling in this paper. The use of signature files as an index for full-text search is widely known. Signature file based access methods initially applied to text have now been used to handle set-oriented queries in Object-Oriented Data Bases (OODB). All the proposed methods use either efficient search method or tree base...
Dynamic Allocation of Signature Files on Parallel Devices
A signature file is one of the efficient access methods for retrieval from a text database. In a large database server, parallel devices are utilized to achieve concurrent access. Efficient allocation of a signature file on parallel devices minimizes the query response time and is important in the design of large databases. In this paper, we investigate the design of parallel signature files. We propose a ...
CMD: A Multidimensional Declustering Method
I/O parallelism appears to be a promising approach to achieving high performance in parallel database systems. In such systems, it is essential to decluster database files into fragments and spread them across multiple disks so that the DBMS software can exploit the I/O bandwidth by reading and writing the disks in parallel. In this paper, we consider the problem of declustering multidimensional dat...
CMD: A Multidimensional Declustering Method for Parallel Database Systems
I/O parallelism appears to be a promising approach to achieving high performance in parallel database systems. In such systems, it is essential to decluster database files into fragments and spread them across multiple disks so that the DBMS software can exploit the I/O bandwidth by reading and writing the disks in parallel. In this paper, we consider the problem of declustering multidimensional dat...
A Hierarchical Technique for Constructing Efficient Declustering Schemes for Range Queries
Multi-disk systems, coupled with declustering schemes, have been widely used in various applications to improve I/O performance by enabling parallel disk accesses. A declustering scheme determines how data blocks should be placed among multiple disks to maximize the parallelism. We focus on the problem of declustering grid-structured multidimensional data with the objective of reducing the resp...
Publication date: 1997